-
Notifications
You must be signed in to change notification settings - Fork 573
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
cmd/snap-update-ns: keep order of existing mount entries #14494
base: master
Are you sure you want to change the base?
cmd/snap-update-ns: keep order of existing mount entries #14494
Conversation
Snapd is affected by several bugs related to content-layout interaction. There's a very popular scheme where a snap has two interacting elements that only work in one special edge case and break regularly: - A content plug is used to receive content of another snap, typically under $SNAP/foo somewhere. Where foo exists and is not created with a mimic. - A layout entry is used to "distribute" $SNAP/foo to /usr/share/foo or other locations. This particular arrangement is used to share essential libraries from a common "runtime" snap which is not a dedicated base. Several basic tests explore the actual behavior in the following situations. - The content is initially connected and a mount namespace is constructed. This is the only correct behavior show in this group of tests. This is known as test case 1. - The content is initially connected and a mount namespace is constructed. The content is then disconnected. This shows several anomalies that are referenced in the code (ordering bug, propagation bug). This is known as test case 2. - The content is initially connected and a mount namespace is constructed. The content is then disconnected and re-connected. This is known as test case 3. - The content is initially disconnected and a mount namespace is constructed. The content is then connected. This is known as test case 4. - The content interface is initially connected and a mount namespace is constructed. The content snap is then refreshed. This is known as test case 5. - The content interface is initially connected and a mount namespace is constructed. The application snap is then refreshed. This is known as test case 6. This is related to issues: - https://warthogs.atlassian.net/browse/SNAPDENG-31644 (flip flop) - https://warthogs.atlassian.net/browse/SNAPDENG-31645 (propagation) A support makefile and a set of data files is provided to re-generate the test data once snapd is improved. Signed-off-by: Zygmunt Krynicki <[email protected]>
Signed-off-by: Zygmunt Krynicki <[email protected]>
Signed-off-by: Zygmunt Krynicki <[email protected]>
We recently realized that the order of saved mount profile, after performing mount namespace update, was subtly incorrect. The order was seemingly swapped after each operation. In reality, the order reflected the order of performed changes, including the special change type of "keep", which means that an entry is kept intact. Fix the problem by storing "kept" changes in the original order, which is back-to-front compared to the order of changes computed by the algorithm. At the same time, keep track of mount changes in the order of their execution. This probably fixes a hell lot of edge cases and weird behaviors that went undiagnosed for many years. Interestingly a test case had a TODO that hinted at the problem, but it was not clear what the problem was at the time. The TODO is now gone and the test case is now up-to-date. Test data for cmd/snap-update-ns/testdata was re-generated. Jira: https://warthogs.atlassian.net/browse/SNAPDENG-31644 Signed-off-by: Zygmunt Krynicki <[email protected]>
15335e7
to
aaceabb
Compare
@@ -65,6 +65,10 @@ type Change struct { | |||
Action Action | |||
} | |||
|
|||
func (c *Change) GoString() string { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
unused?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's useful for printing changes. It's not use here but really valuable in practice.
cmd/snap-update-ns/update.go
Outdated
@@ -93,8 +93,14 @@ func executeMountProfileUpdate(upCtx MountProfileUpdateContext) error { | |||
// Compute the new current profile so that it contains only changes that were made | |||
// and save it back for next runs. | |||
var currentAfter osutil.MountProfile | |||
for i := len(changesMade) - 1; i >= 0; i-- { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's useful to leave a comment for ourselves that the initial state was initially iterated over from back to front, so the things we keep are also present in reverse order. Thus order to preserve the stable state for kept entries we should record them in reverse order.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll update this to make it clearer what is going on and add an explicit test. Expect this in the rebase after the base branch lands.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pushed now
The new function takes the previously existing algorithm out of executeMountProfileUpdate, as the original location makes it very tricky to test due to the presence of all the system interactions through change.Perform. Add some simple tests that show how keep, mount and unmount changes are preserved. Signed-off-by: Zygmunt Krynicki <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
as I said looking at how NeededChanges works at the moment it appears this will mostly affect the order of unmounting of things we kept that we cannot keep anymore the next time around, it looks the right direction but it's hard to predict the general effects, as the logic in NeededChanges is helpful to an extent but as we know has limitations and is hard to predict atm
return upCtx.SaveCurrentProfile(¤tAfter) | ||
} | ||
|
||
func CurrentProfileFromChangesMade(changes []*Change) osutil.MountProfile { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this needs a comment of its own
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
Signed-off-by: Zygmunt Krynicki <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks
This is stacked on top of #14491
We recently realized that the order of saved mount profile, after
performing mount namespace update, was subtly incorrect. The order was
seemingly swapped after each operation. In reality, the order reflected
the order of performed changes, including the special change type of
"keep", which means that an entry is kept intact.
Fix the problem by storing "kept" changes in the original order, which
is back-to-front compared to the order of changes computed by the
algorithm. At the same time, keep track of mount changes in the order of
their execution.
This probably fixes a hell lot of edge cases and weird behaviors that
went undiagnosed for many years. Interestingly a test case had a TODO
that hinted at the problem, but it was not clear what the problem was at
the time. The TODO is now gone and the test case is now up-to-date.
Test data for cmd/snap-update-ns/testdata was re-generated.
Jira: https://warthogs.atlassian.net/browse/SNAPDENG-31644
Signed-off-by: Zygmunt Krynicki [email protected]